RECOMBINANT CALF INTESTINAL ALKALINE PHOSPHATASE



BACKGROUND OF THE INVENTION

The present invention relates to recombinant calf
intestinal alkaline phosphatase and more particularly to
isolated nucleic acids encoding the recombinant form of
calf intestinal alkaline phosphatase.






Alkaline phosphatases (APs) are a family of
functionally related enzymes named after the tissues in
which they predominantly appear. Such enzymes carry out
hydrolase/transferase reactions on phosphate-containing
substrates at a high pH optimum. The exact role of APs in
biological processes remains poorly defined.




In humans and other higher animals, the AP family
contains four members that are each encoded by a separate
gene locus as reviewed in Millan, Anticancer Res. 8:995-1004
(1988) and Harris, Clin. Chim. Acta 186:133-150
(1989). The alkaline phosphatase family includes the
tissue-specific APs (placental AP, germ cell AP and
intestinal AP) and the tissue non-specific AP found
predominantly in the liver, bone and kidney.



Intestinal alkaline phosphatase (IAP) derived
from humans has been extensively characterized. As with
all known APs, human IAP appears as a dimer, which is
referred to as p75/150 in Latham & Stanbridge, P.N.A.S.
(USA) 87:1263-1267 (1990). A cDNA clone for human adult
IAP has been isolated from a λgt11 expression library.
This cDNA clone is 2513 base pairs in length and contains
an open reading frame that encodes a 528 amino acid
polypeptide as described in Henthorn et al., P.N.A.S. (USA)
84:1234-1238 (1987). IAP has also been found in other
species, such as mice, cows, and fish as reported in McComb
et al., Alkaline Phosphatases (Plenum, New York, 1989).

Generally, alkaline phosphatases are useful
diagnostically in liver and bone disorders as described in
McComb et al., supra, or for certain cancers as reviewed in
Millan, Prog. Clin. Biol. Res. 344:453-475 (1990). APs
are also useful as reagents in molecular biology. Of the
known APs, bovine IAP has the highest catalytic activity.
This property has made bovine IAP highly desirable for such
biotechnological applications as enzyme conjugates for use
as diagnostic reagents or dephosphorylation of DNA, for
example.

The isozymes of bovine IAP (b.IAP), including
calf IAP, adult bovine IAP, and a tissue non-specific
isozyme extracted from the small intestines, have been
characterized by Besman & Coleman, J. Biol. Chem.
260:1190-1193 (1985). Although it is possible to purify
naturally-occurring calf IAP extracted from intestinal
tissues, it is technically very difficult to obtain an
enzyme preparation of reproducible quality and purity.
Generally, the enzymes are extracted from bovine intestines
obtained from slaughterhouses. Since the sacrificed
animals are not of the same age, the proportion of the
known b.IAP isozymes will vary significantly among the
purified extracts.

Moreover, the intestine is known to contain high
amounts of peptidases and glycosidases that degrade the
naturally occurring IAP. Since the time from slaughter to
enzyme extraction varies greatly, the amount of degradation
will also vary greatly, resulting in a mixture of intact
enzyme and several degradation products. Accordingly, the
known methods of purifying IAP from naturally-occurring
sources produce microheterogeneity in the purified IAP
preparations. These partially degraded IAP molecules are
technically difficult to separate from the native intact
IAP molecules.

Due in part to the technical problems of
separating intact b.IAP from degraded or partially
processed calf IAP and the minute quantities of purified
intact b.IAP that can be obtained from naturally-occurring
sources, it has been difficult to determine the amino acid
sequence of calf IAP. In addition, attempts to
crystallize the IAP protein from the natural source to
determine the three-dimensional structure have been
hampered because of such microheterogeneity of the enzyme
obtained from natural sources. It has only been possible
to obtain small crystals of the natural enzyme, which are
of insufficient quality for crystallographic studies.

Thus, a need exists for a homogeneous source of
calf intestine alkaline phosphatase. Such a source would
ideally provide an ample supply of pure, intact calf IAP
for research and commercial use without time-consuming and
labor intensive procedures. The present invention
satisfies this need and provides related advantages as
well.

SUMMARY OF THE INVENTION

The present invention generally relates to
recombinant calf intestinal alkaline phosphatase (calf IAP)
having an amino acid sequence substantially the same as
naturally occurring calf IAP or its active fragments. The
invention further provides isolated nucleic acids encoding
such polypeptides. Vectors containing these nucleic acids
and recombinant host cells transformed or transfected with
such vectors are also provided.

Nucleic acid probes having nucleotide sequences
complementary to a portion of the nucleotide sequence
encoding calf IAP are also provided. Such probes can be
used for the detection of nucleic acids encoding calf IAP
or active fragments.



WO 93/18139    PCT/US93/02172
The present invention further provides a
multifunctional polypeptide containing an amino acid
sequence of calf IAP and a second amino acid sequence
having specific reactivity with a desired ligand. The
second amino acid sequence can be, for example, an
antibody sequence when the desired ligand is an antigen.

The pure recombinant polypeptides of the present
invention, including the multifunctional polypeptides, are
particularly useful in methods for detecting the presence
of antigens or other ligands in substances, such as fluid
samples and tissues. Such diagnostic methods can be used
for in vitro detection of such ligands.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 (SEQ ID NO: 9) shows the full length
genomic sequence of calf IAP and the deduced amino acid
sequence.



Figure 2 shows the restriction map of the entire
calf IAP gene and the full length cDNA.

Figure 3 (SEQ ID NOS: 10-13) shows a comparison
of IAPs from calf (b.IAP), rat (r.IAP), mouse (m.IAP), and
human (h.IAP).



Figure 4 shows the results of studies relating to
the heat inactivation of purified and recombinant calf IAP.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to the elucidation
of the calf intestinal alkaline phosphatase gene. More
specifically, the invention relates to the nucleotide
sequence of the region of the gene encoding the enzyme.

Previous attempts to produce a full length cDNA
or a complete genomic clone for calf IAP have been
unsuccessful. RNA extracted from bovine intestinal tissues
is either not fully processed (i.e., incompletely spliced
RNA) or is quickly degraded after death. As such, only
fragments of the genome coding region could be obtained.

It was through the extensive experimentation
set forth in the examples below that the full length cDNA
clone of calf IAP was determined. Accordingly, the present
invention is directed to isolated nucleic acids comprising
the nucleotide sequence encoding calf IAP or an active
fragment thereof having the enzymatic activity of the
intact calf IAP. The nucleic acids can be DNA, cDNA or
RNA.

The nucleic acid can have a nucleotide sequence
substantially the same as the sequence identified in Figure
1, which shows the complete coding region of the genomic
sequence of calf IAP. This nucleic acid (5.4 kb) contains
11 exons separated by 10 small introns at positions
identical to those of other members of the tissue-specific
AP family. Additionally, 1.5 kb of the 5' sequence
contains putative regulatory elements having homology to
human and mouse IAP promoter sequences.

As used herein, the term "substantially the
sequence" means the described nucleotide or amino acid
sequence or other sequences having one or more additions,
deletions or substitutions that do not substantially affect
the ability of the sequence to encode a polypeptide having
a desired activity, such as calf IAP or its active
fragments. Thus, modifications that do not destroy the
encoded enzymatic activity are contemplated.

As used herein, an active fragment of calf IAP
refers to portions of the intact enzyme that substantially
retain the enzymatic activity of the intact enzyme. The
retention of activity can be readily determined using
methods known to those skilled in the art.

The terms "isolated" and "substantially purified"
are used interchangeably and mean the polypeptide or
nucleic acid is essentially free of other biochemical
moieties with which it is normally associated in nature.
Recombinant polypeptides are generally considered to be
substantially purified.

The present invention further relates to
expression vectors into which the coding region of the calf
IAP gene can be subcloned. "Vectors" as used herein are
capable of expressing nucleic acid sequences when such
sequences are operationally linked to other sequences
capable of effecting their expression. These expression
vectors must be replicable in the host organisms either as
episomes or as an integral part of the chromosomal DNA.
Lack of replicability would render them effectively
inoperable. In general, useful vectors in recombinant DNA
techniques are often in the form of plasmids, which refer
to circular, double stranded DNA loops which are not bound
to the chromosome in their vector form. Suitable
expression vectors can be plasmids such as, for example,
pcDNA 1 (Invitrogen, San Diego, CA).

A number of procaryotic expression vectors are
known in the art, such as those disclosed, for example, in
U.S. Patent Nos. 4,440,859; 4,436,815; 4,431,740;
4,431,739; 4,428,941; 4,425,437; 4,418,149; 4,411,994 and
4,342,832, all incorporated herein by reference.
Eucaryotic systems and yeast expression vectors can also be
used as described, for example, in U.S. Patent Nos.
4,446,235; 4,443,539; and 4,430,428, all incorporated
herein by reference.



The vectors can be used to transfect or transform
suitable host cells by various methods known in the art,
such as described in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, NY (1989). Such
host cells can be either eucaryotic or procaryotic cells.
Examples of such hosts include Chinese hamster ovary (CHO)
cells, E. coli and baculovirus infected insect cells. As
used herein, "host cells" or "recombinant host cells" refer
not only to the particular subject cell but to the progeny
or potential progeny of such cell. Because certain
modifications may occur in succeeding generations due to
either mutation or environmental influences, such progeny
may not, in fact, be identical to the parent cell, but are
still included within the scope of the term as used
herein.



The present invention further relates to
recombinant proteins or polypeptides produced by the
recombinant host cells of the present invention. The
recombinant calf IAP protein has been characterized in
terms of its heat stability up to about 50°C,
electrophoretic and isoelectric focusing (IEF) behavior and
kinetic parameters. The recombinant calf IAP protein of
the present invention displayed kinetic
properties comparable to commercially available purified
calf IAP, while showing less heterogeneity than the
commercial enzymes in polyacrylamide gel electrophoresis
and IEF, as described in the examples below.



Methods for obtaining or isolating recombinant
calf IAP or active fragments are also provided. Such
methods include culturing the recombinant host cells in a
suitable growth medium. The protein or active fragments
can thereafter be isolated from the cells by methods known
in the art. If the expression system secretes calf IAP
protein into growth media, the protein can be purified
directly from cell-free media. If the protein is not
secreted, it can be isolated from cell lysates. The
selection of the appropriate growth conditions and recovery
methods is within the knowledge of one skilled in the art.
Recombinant calf IAP or active fragments thereof can be
unglycosylated or have a different glycosylation pattern
than the native enzyme depending on the host that is used
to prepare it.

The present invention further provides isolated
nucleic acids containing a nucleotide sequence encoding
calf IAP or an active fragment thereof and a second
nucleotide sequence encoding a polypeptide having specific
reactivity with a ligand. Such nucleic acids encode a
chimeric or multifunctional polypeptide in which a region
of the polypeptide has enzymatic activity conferred by the
calf IAP sequence attached to a second region having
specific reactivity with a particular ligand. Such
multifunctional polypeptides are particularly useful in
diagnostic assays for determining the presence or
concentration of a particular ligand in a sample. The
ligand can be, for example, a cancer marker, allergen, drug
or other moiety having an ability to specifically bind with
an antibody or antibody-like agent encoded by a
multifunctional polypeptide of the present invention. For
instance, the second nucleotide sequence can encode an
anti-CEA antibody when the target ligand is CEA
(carcinoembryonic antigen). The ligand can also be a
fragment of DNA or other nucleic acids.

Nucleic acid probes specific for a portion of
the nucleotides that encode calf IAP can be used to detect
nucleic acids specific to calf IAP for diagnostic purposes.
Nucleic acid probes suitable for such purposes can be
prepared from the cloned sequences or by synthesizing
oligonucleotides that hybridize only with the homologous
sequence under stringent conditions. The oligonucleotides
can be synthesized by any appropriate method, such as by an
automated DNA synthesizer.

The oligonucleotides can be used to detect DNA
and mRNA or to isolate cDNA clones from libraries. The
particular nucleotide sequences selected are chosen so as
to correspond to the codons encoding a known amino acid
sequence from the protein. Generally, an effective length
of a probe as recognized in the art is about 14 to about 20
bases. Longer probes of about 25 to about 60 bases can
also be used. A probe can be labelled, using labels and
methods well known in the art, such as a radionucleotide or
biotin, using standard procedures.
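
The relationship between a known amino acid sequence and the oligonucleotide pool needed to cover its codons can be sketched as follows. This is a minimal illustration only: the peptide "MWKDE" is hypothetical and is not taken from calf IAP, the function names are assumptions, and the codon counts are those of the standard genetic code.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide):
    """Count the distinct oligonucleotides that could encode the peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]
    return total


def probe_length(peptide):
    """Length in bases of an oligonucleotide covering every codon."""
    return 3 * len(peptide)


# Hypothetical five-residue stretch chosen for low degeneracy.
peptide = "MWKDE"
degeneracy = probe_degeneracy(peptide)
length = probe_length(peptide)
in_typical_range = 14 <= length <= 20
```

Under these assumptions, a stretch rich in Met and Trp gives a short probe within the 14 to 20 base range with few alternative sequences to synthesize.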



The purified recombinant calf IAP or its active
fragments can be used for diagnostic purposes to determine
the presence or concentration of a ligand in a sample. The
sample can be a fluid or tissue specimen obtained, for
example, from a patient suspected of being exposed to a
particular antigen or DNA fragment. Those skilled in the
art will recognize that any assay capable of using an
enzyme-catalyzed system can be used in the detection
methods of the present invention.

In the detection methods of the present
invention:

(a) a sample is contacted with the recombinant
calf IAP or an active fragment thereof attached to a
reagent specifically reactive with the ligand to be
detected;

(b) the sample is contacted with a detectable
agent catalyzed by calf IAP; and

(c) the binding of the sample to the reagent is
detected, where binding indicates the presence of the
ligand in the sample.

The methods can also be used to determine the
concentration of a ligand in the sample by relating the
amount of binding to the concentration of the ligand. To
determine the concentration, the amount of binding can be
compared to known concentrations of the ligand or to
standardized measurements, such as slopes, determined from
known concentrations of the ligand.
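
One way to relate the amount of binding to ligand concentration is a linear standard curve, sketched below. The standards and readings are hypothetical values invented for illustration, the function names are assumptions, and a straight-line response is assumed; a real assay may need a different curve model.

```python
def fit_standard_curve(concentrations, signals):
    """Least-squares fit of signal = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_concentration(signal, slope, intercept):
    """Invert the standard curve to estimate an unknown concentration."""
    return (signal - intercept) / slope


# Hypothetical ligand standards (arbitrary units) and measured signals.
standards = [0.0, 1.0, 2.0, 4.0]
readings = [0.05, 0.25, 0.45, 0.85]

slope, intercept = fit_standard_curve(standards, readings)
unknown = estimate_concentration(0.65, slope, intercept)
```

The slope obtained from the known concentrations plays the role of the "standardized measurement" mentioned above.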

A variety of ligands can be detected by the
present methods. The ligand can be, for example, a protein
or polypeptide having antigenic properties or a nucleic
acid, such as DNA or RNA.

Reagents reactive with such ligands can be
antibodies or reactive fragments of such antibodies when
the ligand is an antigen or antigen-like molecule. The
reagent can also be a nucleotide probe that hybridizes or
binds to a specific nucleic acid, such as DNA or RNA. Such
probes can be oligonucleotides that are complementary to
cDNA or genomic fragments of a ligand.

Procedures for attaching the enzymes to various
reagents are well known in the art. Techniques for
coupling enzymes to antibodies, for example, are described
in Kennedy et al., Clin. Chim. Acta 70:1 (1976),
incorporated herein by reference. Reagents useful for such
coupling include, for example, glutaraldehyde, p-toluene
diisocyanate, various carbodiimide reagents, p-benzoquinone,
m-periodate, N,N'-o-phenylenedimaleimide and the like.
Alternatively, the multifunctional polypeptides of the
present invention can be used.

Suitable substrates for the biochemical detection 
of ligands according to the methods of the present 
invention include, for example, p-nitrophenylphosphate. 

The recombinant form of calf IAP is also useful
for the development of calf IAP having greater heat
stability. By site directed mutagenesis, it is possible to
modify the nucleic acid sequence encoding the
recombinant protein to obtain a heat stable calf IAP
comparable to human placental AP, which is known to be
stable at about 65°C. Greater heat stability would allow
the use of such a modified calf IAP in procedures requiring
higher heating, such as Southern blotting, for example,
which generally denatures many enzymes.

The following examples are intended to illustrate 
but not limit the invention. 






EXAMPLE I 
Libraries and Screening

Initially, a λgt11 cDNA library prepared from
adult bovine intestine (Clontech Laboratories, Palo Alto,
CA) was screened using a mouse IAP cDNA fragment described
in Manes et al., Genomics 8:541-554 (1990) as a probe. A
2.1 kb unprocessed cDNA fragment and a 1.1 kb processed
cDNA fragment, both isolated from this library, were used
to screen a genomic library prepared from adult cow liver
in EMBL3 SP6/T7 (Clontech Laboratories, Palo Alto, CA).
Radiolabelling of probes with 32P and identification and
isolation of positive clones were done as described in Manes
et al., supra, which is incorporated herein by reference.
Large-scale phage DNA preparation was performed as
described in Sambrook et al., supra, incorporated herein by
reference.



Initially, one positive cDNA clone was obtained
upon screening the λgt11 cDNA library with the mouse IAP
cDNA fragment. Sequencing from the ends of the 2.1 kb cDNA
fragment (R201) revealed an incomplete cDNA encoding exons
VI through XI of an alkaline phosphatase gene as identified
by sequence comparison to known AP genes. This cDNA
fragment included all introns and revealed several STOP
codons as well as two frameshifts in the putative coding
region of the gene.

Although further sequence information of R201
suggested that it is possibly transcribed from a
pseudogene, it was used as a probe for further screening of
the λgt11 library. Two additional cDNA clones were
subsequently isolated and identified as transcripts of
another alkaline phosphatase gene. Again, one fragment of
0.8 kb length (BB203) turned out to be reverse transcribed
from an incomplete and unprocessed RNA, whereas the other
one, a cDNA fragment of 1.1 kb length (BB204), was derived
from a partial but processed mRNA, extending from the end
of exon V through exon XI, lacking a putative
polyadenylation site and a poly-A tail.

EXAMPLE II 

Characterization of Genomic Clones and Sequence Analysis 

Genomic DNA was isolated from adult cow liver and
Southern blot analysis was performed using standard
protocols as described in Sambrook et al., supra.
Restriction enzymes were obtained from Gibco BRL,
Boehringer Mannheim, and New England Biolabs. Twenty µg of
genomic DNA were used per reaction. The blots were probed
with the 2.1 kb unprocessed cDNA fragment, and washed under
high stringency conditions (0.1 x SSC at 65°C).

Two bands in the genomic Southern were identified
as fragments derived from the b.IAP gene. The only other
non-human mammalian genome investigated extensively for
tissue-specific AP (TSAP) genes so far has been the murine
genome, as reported in Manes et al., supra. Two murine
TSAP genes, one termed embryonic AP (EAP), the other coding
for IAP, and a pseudogene were cloned. In previous
studies, it was shown that there are two TSAP genes
expressed in the bovine genome according to Culp et al.,
Biochim. Biophys. Acta 831:330-334 (1985) and Besman &
Coleman, supra. Similarly, two APs have been found
expressed in the adult intestine of mice as reported in
Hahnel et al., Development 110:555-564 (1990). Expression
of AP in rat intestine appears to be even more complex
(Eliakim et al., Am. J. Physiol. 159, 1.1:G93-98 (1990)).
Identification of the b.IAP gene was possible by comparison
of its deduced amino acid sequence with N-terminal
sequences reported for both TSAP isozymes.

Since further screening of the cDNA library
revealed no additional positive clones, both R201 and BB204
were used to screen an EMBL3 SP6/T7 genomic library. Three
positive clones were obtained and analyzed by Southern
blotting. Subsequent sequencing of several fragments from
two of the clones showed that one contained the entire
coding region for the b.IAP gene as identified by
comparison of deduced amino acid sequence with sequences
previously determined in Culp et al., supra and Besman &
Coleman, supra. A 5.4 kb sequence from overlapping Hind
III and BamHI fragments of the clone containing the b.IAP
gene is presented in Figure 1. The other clone contained
sequences identical (except for a few basepair changes)
with R201.



Genomic clones were [...] and [...]
were determined as described in Manes et al., supra.
Nucleic acid and protein sequences were
analyzed using the MacVector program
(IBI, New Haven, CT).
EXAMPLE III 
PCR Mutagenesis and Subcloning into pcDNA



A 23-mer primer ("MKNHE" (SEQ ID NO: 1): 5'-GCTAGCCATGCAGGGGGCCTGCG-3'
(SEQ ID NO: 2)) was used to
amplify base pairs 1497-1913 of the b.IAP gene which had
been subcloned as a Hind III/BamHI fragment into
Bluescript-KS+ (Stratagene, San Diego, CA). MKNHE (SEQ ID
NO: 1) had been designed to create a new Nhe I site by
altering the three 5' nucleotides of the primer sequence
compared to the genomic sequence to allow easy
subcloning into different expression vectors. The
universal SK primer was used as the complementary reverse
primer in the polymerase chain reaction (PCR).
The plasmid was heat denatured, annealed to the primers and
subjected to 30 cycles of PCR amplification in an Automatic
Thermocycler (MJ Research, Piscataway, NJ). Times and
temperatures were set as follows: annealing at 40°C for 30
seconds, extension for 3 minutes at 72°C and denaturing at
95°C for 30 seconds. The amplified fragment was directly
subcloned into the "T-modified" EcoRV site of Bluescript as
described in Marchuk et al., Nucl. Acids Res. 19:1154
(1990), incorporated herein by reference, in the
orientation of β-galactosidase transcription.
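
The stated properties of the MKNHE primer can be checked with a few lines of code. This is an illustrative sketch only: the primer sequence is the one given above and GCTAGC is the Nhe I recognition sequence, while the function and field names are assumptions introduced here.

```python
MKNHE = "GCTAGCCATGCAGGGGGCCTGCG"  # primer sequence as given in the text
NHE_I_SITE = "GCTAGC"              # Nhe I recognition sequence


def describe_primer(primer, site):
    """Summarize length, restriction-site position and start-codon position."""
    primer = primer.upper()
    return {
        "length": len(primer),
        "site_position": primer.find(site),   # 0-based, -1 if absent
        "atg_position": primer.find("ATG"),   # 0-based, -1 if absent
    }


summary = describe_primer(MKNHE, NHE_I_SITE)
```

The Nhe I site sits at the very 5' end of the 23-mer, which is consistent with the site having been created by altering the first three nucleotides, and an ATG follows one base after the site.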

EXAMPLE IV

Sequencing of the Amplified Fragment

The amplified fragment was sequenced using the
universal T3 and T7 primers in the Sanger dideoxy chain
termination procedure as described in Sanger et al., Proc.
Natl. Acad. Sci. U.S.A. 74:5463-5467 (1977), which is
incorporated herein by reference, to exclude the
possibility of secondary mutations. The Hind III/BamHI
fragment was used together with a 3.2 kb BamHI/SmaI
fragment of the b.IAP gene for directional subcloning into
a Hind III/EcoRV opened pcDNA 1 expression vector
(Invitrogen, San Diego, CA).

EXAMPLE V 
Recombinant Expression of b.IAP 

The b.IAP gene subcloned into pcDNA 1 was
transfected into Chinese hamster ovary (CHO) cells, ATCC
No. CCL61, by means of Ca2+ coprecipitation as described in
Hummer and Millan, Biochem. J. 274:91-95 (1991), which is
incorporated herein by reference. The recombinant protein
was extracted with butanol after incubating for 2 days.

The b.IAP gene presented in Figure 1 includes an
open reading frame (ORF) of 2946 bp, containing 11 exons
and 10 introns of very compact nature. Exon and intron
borders were determined by comparison with BB204 and other
known AP genes described in Manes et al., supra, Henthorn
et al., J. Biol. Chem. 263:12011-12019 (1988), Knoll et
al., J. Biol. Chem. 263:12020-12027 (1988), and Millan &
Manes, Proc. Natl. Acad. Sci. USA 85:3025-3028 (1988). A
translation initiation codon ATG was identified by sequence
comparison to known TSAP genes and is preceded by an
in-frame STOP codon 48 bp upstream. The ORF, which is
terminated by the STOP codon TAA, codes for a peptide of
533 amino acids in length. The mature protein of 514 amino
acids with a calculated Mr of 64,400 Da is preceded by a
hydrophobic signal peptide as is the case for all known
APs.
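
The figures quoted above imply a few derived quantities, worked through in the short sketch below. It assumes that the 2946 bp figure spans the genomic sequence from the ATG through the TAA codon, introns included; the derived values are therefore estimates under that assumption and are not numbers stated in the text.

```python
ORF_SPAN_BP = 2946        # ATG through TAA in the genomic sequence (assumed)
PRECURSOR_RESIDUES = 533  # peptide encoded by the ORF
MATURE_RESIDUES = 514     # mature protein
INTRON_COUNT = 10

# Signal peptide is the part of the precursor absent from the mature protein.
signal_peptide_residues = PRECURSOR_RESIDUES - MATURE_RESIDUES

# Three bases per residue, plus one stop codon.
coding_bp = 3 * (PRECURSOR_RESIDUES + 1)

# Whatever lies between ATG and TAA that is not coding must be intronic.
intron_bp = ORF_SPAN_BP - coding_bp
mean_intron_bp = intron_bp / INTRON_COUNT
```

Under these assumptions the signal peptide is 19 residues long and the ten introns average roughly 134 bp, in keeping with their description as very compact.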

The predicted amino acid sequence of the b.IAP
protein is highly homologous to other known IAPs as shown
in Figure 3. As shown in Figure 3 there is identity in
those parts corresponding to the partial amino acid
sequences previously determined for b.IAP (Culp et al.,
supra; Besman and Coleman, supra). Besman & Coleman
determined N-terminal amino acid sequences for two
differentially expressed AP isozymes. The 16 N-terminal
amino acids determined for the isozyme found only in
newborn calves differ in three or four residues from the
N-terminus of the enzyme exclusively expressed in adults.

EXAMPLE VI

Reverse Transcriptase-PCR 

In order to construct a full length cDNA, reverse
transcriptase-PCR (RT-PCR) was performed as follows: total
RNA from a stably transfected CHO cell clone (M2) was
isolated by acid guanidinium thiocyanate-phenol-chloroform
extraction as described in Chomczynski & Sacchi, Anal.
Biochem. 162:156-159 (1987), incorporated herein by
reference. The reverse transcriptase reaction was
conducted according to the protocol of the manufacturer
(Promega, Wisconsin) using 10 µg of RNA.

The reaction mixture was extracted with phenol-chloroform,
precipitated with ethanol and resuspended in
Taq polymerase buffer. The subsequent PCR was performed
over 35 cycles of amplification following an initial
denaturation at 94°C for 5 minutes, annealing at 55°C for
30 seconds and extension at 72°C for 5 minutes. The Taq
polymerase was added to the reaction mixture after
denaturation only. The subsequent PCR settings were:
denaturation at 94°C for 45 seconds, annealing at 55°C for
1 minute and extension at 72°C for 4 minutes. The primers
used for this reaction were MKNHE (SEQ ID NO: 1) and
sequencing primer UP6: TGGGCCGCCTGAAGGAGC (SEQ ID NO: 3)
(see Figure 2).

The sequencing strategy as well as a restriction
map and the genomic structure of the b.IAP gene are shown
in Figure 2. The strategies for subcloning the coding
region of the gene into an expression vector using PCR and
for construction of a full length cDNA by means of RT-PCR
are indicated in Figure 2. A single fragment of
approximately 830 bp length had been obtained from RT-PCR
as could be expected from the genomic sequence.

EXAMPLE VIII 

Characterization of Recombinant Calf IAP

The sequence for the calf intestinal AP gene was
determined as described above. A full length cDNA was
constructed using a partial cDNA clone (BB204) and a
fragment obtained by RT-PCR.

A cDNA fragment clone (R201) and a corresponding
genomic clone were obtained, which show properties of
a putative pseudogene. Both clones contain STOP codons
within the coding region and several frameshifts. Bands
corresponding to the putative pseudogene could only be
identified upon hybridizing with a mouse TNAP cDNA which
gave a distinct pattern. This result suggests that the
bands correspond to TSAP genes only, and that the
pseudogene is more related to TNAP. In contrast, the
murine pseudogene has been found to show more homology
to the mouse EAP gene (Manes et al., supra).



The sequence and genomic structure of the b.IAP
gene show high homology to all known TSAP genes. The
smallest exon, exon VII, is only 73 bp long while the
longest exon, exon XI, is approximately 1.1 kb long. The
exact length of exon XI cannot be determined since no cDNA
with a poly-A tail had been isolated. The estimate given
is based on the identification of a putative
polyadenylation site AATAAA (bp 5183-5188) in the 3' non-coding
region of the gene (underlined in Figure 1). The introns
are among the smallest introns reported (Hawkins, Nucl.
Acids Res. 16:9893-9908 (1988)) as was found in the case of
other TSAP genes as well (Manes et al., supra; Henthorn et
al., supra; Knoll et al., supra; Millan and Manes, supra).
The largest one, splitting exon V and exon VI, is only 257
bp long. All exon-intron junctions conform to the GT-AG
rule (Breathnach et al., Proc. Natl. Acad. Sci. USA
75:4853-4857 (1978)) and also conform well to the consensus
sequences (C/A)AG/GT(A/G)AGT (SEQ ID NO: 4) and
(T/C)nN(C/T)AG/G (SEQ ID NO: 5) for donor and acceptor
sites, respectively (Mount, Nucl. Acids Res. 10:459-473
(1982)).
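
The GT-AG rule referred to above can be expressed as a simple test on an intron sequence. The sketch below is illustrative only: the function name is an assumption and the two example introns are invented toy sequences, not introns of the b.IAP gene.

```python
def follows_gt_ag_rule(intron):
    """Return True if the intron starts with GT and ends with AG."""
    intron = intron.upper()
    return (
        len(intron) >= 4
        and intron.startswith("GT")
        and intron.endswith("AG")
    )


# Hypothetical toy introns used only to exercise the check.
toy_introns = {
    "conforming": "GTAAGTCCTTCTCTTTCCAG",
    "non_conforming": "GCAAGTCCTTCTCTTTCCAC",
}

results = {name: follows_gt_ag_rule(seq) for name, seq in toy_introns.items()}
```

A fuller check against the donor and acceptor consensus sequences would also need the flanking exon bases, which this sketch does not model.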



Interestingly, the entire coding region of exon
XI shows a high G/C content of over 60 to 80% compared to
a rather equal ratio of G/C to A/T throughout the whole
structural gene. Other regions of biased GC content were
found at bp 270 to bp 490 with a high A/T content and in a
region preceding the polyadenylation site, which again
shows a high G/C content.
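
Regional G/C bias of the kind described above is commonly located by computing G/C content over successive windows. The following is a minimal sketch under stated assumptions: the window and step sizes and the toy sequence are arbitrary choices for illustration and do not reproduce the analysis performed on the b.IAP gene.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_windows(seq, window, step):
    """Return (start, G/C fraction) pairs for successive windows."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Hypothetical toy sequence: an A/T-rich half followed by a G/C-rich half.
toy_sequence = "ATATATATAT" + "GCGCGGCCGC"
profile = gc_windows(toy_sequence, window=10, step=10)
```

On a real sequence, windows whose fraction exceeds a chosen threshold, such as 0.6, would mark candidate G/C-rich regions.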

A putative TATA-box has been identified in the
1.5 kb of sequence preceding the coding region (bp 1395-1400,
underlined in Figure 1). It shows the same variant
ATTTAA sequence embedded in a conserved region of 25 bp as
was previously reported for the mouse TSAP genes (Manes et
al., supra) and two human TSAP genes (Millan, Nucl. Acids
Res. 15:10599 (1987); Millan and Manes, supra).

The sequence GGGAGGG has been shown to be part of
the putative mouse TSAP promoters (Manes et al., supra) as
well as of two human TSAP promoters (Millan, (1987), supra;
Millan and Manes, supra). This sequence is also present in
the putative promoter region of the b.IAP gene.

The sequence CACCC or its reverse complement
is repeated 6 times in the region of bp 1182-1341, 24 times
in the entire structural gene and 31 times throughout the
whole sequence shown here. However, only one less
conserved CACCC box (Myers et al., Science 232:613-618
(1986)) was identified.
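Counting a motif together with its reverse complement can be sketched as follows. The helper names are hypothetical, overlapping occurrences are counted, and the example fragment is a 60-base stretch of the promoter region transcribed from the sequence listing (an assumed, not verified, transcription).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_motif_both_strands(seq, motif="CACCC"):
    """Count (overlapping) occurrences of motif or its reverse complement."""
    seq, motif = seq.upper(), motif.upper()
    targets = {motif, reverse_complement(motif)}
    width = len(motif)
    return sum(seq[i:i + width] in targets
               for i in range(len(seq) - width + 1))


# Fragment of the putative promoter as read from SEQ ID NO: 9.
fragment = ("CAGTCAGGACCTTGAAAACGTTCCTCAGGC"
            "CTAGACATCTGCACCCTAATCCCCACCCCA")
print(count_motif_both_strands(fragment))
```

Applying the same function to the region bp 1182-1341, to the structural gene and to the whole sequence would reproduce the three counts quoted above, provided the listing is transcribed without errors.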

Since it was shown for dog IAP that the enzyme
can be induced by corticosteroid hormone (Sanecki et al.,
Am. J. Vet. Res. 51(12):1964-1968 (1990)), hormone
responsive elements in the genomic sequence of b.IAP were
identified. Palindromic and direct repeats, known to be
binding sites for dimeric nuclear factors as described in
O'Malley, Mol. Endocrinol. 5:94-99 (1990), were identified
in the 1.5 kb upstream of the initiation codon. A long,
imperfect palindromic repeat (CACACCTCCTGCCCAG-N7-
CTGGTGAGGAGCTGAG) (SEQ ID NO: 6) extends from bp 899 to bp
937. A direct repeat of the sequence GGGCAGG spaced by
three nucleotides starts at bp 1311.
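How imperfect the palindrome is can be checked by comparing the left arm with the reverse complement of the right arm. The snippet below is an illustrative sketch; the function names are hypothetical and the two 16-base arms are those given for SEQ ID NO: 6.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def palindrome_matches(left_arm, right_arm):
    """Positions at which left_arm equals the reverse complement of right_arm."""
    if len(left_arm) != len(right_arm):
        raise ValueError("arms must have equal length")
    mirrored = reverse_complement(right_arm)
    return sum(a == b for a, b in zip(left_arm.upper(), mirrored))


left, right = "CACACCTCCTGCCCAG", "CTGGTGAGGAGCTGAG"
matches = palindrome_matches(left, right)
print(f"{matches} of {len(left)} positions are symmetric")
```

A perfect palindrome would give a match at every position; the mismatches are what make this repeat "imperfect".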

Several regions of high homology to mouse (Manes
et al., supra) and human (Millan, (1987), supra) IAP genes
have been identified in the putative promoter region.
However, one stretch of 10 bp (AGCCACACCC) (SEQ ID NO: 7)
was found to be identical with a sequence in the same
region upstream of the TATA box of the human β-globin gene
(Myers et al., supra).






Another region of interest precedes the putative
polyadenylation site at bp 5016. The sequence
ACAGAGAGGAGA (SEQ ID NO: 8) is imperfectly repeated, spaced
by an inverted repeat overlapping the last adenine
nucleotide (ACAG-T-GACA). The presented 1.5 kb of the
presumed promoter of the b.IAP gene contains several
additional putative regulatory elements. A short stretch
of 14 alternating thymines and guanines, interrupted by one
adenine, was found at position 601 of the sequence.
Interestingly, this sequence is identical to a part of a
slightly longer stretch with the same characteristics
beginning at bp 2713 within the intron splitting exon V and
VI. Another stretch of 36 alternating pyrimidines and
purines is found at position 732, being mainly composed of
cytosine and adenine nucleotides. Identical structures are
reported for the human germ cell AP gene (Millan and Manes,
supra) and are thought to form Z-DNA structures, which may
play a role in the regulation of gene expression (Nordheim
and Rich, Nature (London) 303:674-678 (1983)).



As shown in Figure 3, the deduced amino acid
sequence of b.IAP is highly homologous to all known IAPs.
Identical residues and conservative amino acid
substitutions are found within structurally important
regions, as is the case for the other TSAPs as well,
whereas variability is almost exclusively found at the C-
terminus and in the highly variable loops (Millan, (1988),
supra).

Asp^°^ of b.IAP resides within a conserved sequence
of 4 amino acids in the same region of the human intestinal
gene (indicated in Figure 3) as well as of human PLAP
(Millan, J. Biol. Chem. 261:3112-3115 (1986)). This
residue was shown for PLAP to be the attachment site of a
phosphatidyl-inositol membrane anchor (Micanovic et al.,
Proc. Natl. Acad. Sci. USA 87:157-161 (1990)). Evidence
has been presented previously that b.IAP is also anchored
to the plasma membrane in such a fashion. There appears to
be a spatially regulated release of IAP into the lumen
without cleavage of the anchor in a variety of species
(Hoffmann-Blume et al., Eur. J. Biochem. 199:305-312
(1991)).

EXAMPLE IX

Comparison of Purified and Recombinant Forms of Calf IAP

Values for Km and for Ki for L-Phe were determined for
the recombinant enzyme as well as for purified protein from
calf intestine as described in Hummer and Millan, supra,
and Wilkinson, Biochem. J. 80:324-332 (1961), incorporated
herein by reference. Both the purified b.IAP from natural
sources and the recombinant b.IAP show identical values for
Km (within standard deviations), and only slightly different
values of Ki. Km was determined as 0.77 ± 0.12 for the
recombinant enzyme and as 0.86 ± 0.17 for the purified
natural enzyme. Ki for L-Phe was found to be 15.2 ± 1.8
and 11.2 ± 1.0 for the recombinant and purified enzymes,
respectively. Thus, the results of these findings indicate
that the natural and recombinant forms of calf IAP have
comparable properties and activities.
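The phrase "within standard deviations" can be made concrete by asking whether the mean ± SD ranges of two measurements overlap. The sketch below is a simple illustration and not a statistical test; the helper names are hypothetical, and the values are the Km and Ki figures reported above, read as mean ± standard deviation (units are not stated in the text).

```python
def interval(mean, sd):
    """Closed interval covering mean minus SD to mean plus SD."""
    return (mean - sd, mean + sd)


def intervals_overlap(a, b):
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


km_recombinant, km_natural = interval(0.77, 0.12), interval(0.86, 0.17)
ki_recombinant, ki_natural = interval(15.2, 1.8), interval(11.2, 1.0)

print("Km ranges overlap:", intervals_overlap(km_recombinant, km_natural))
print("Ki ranges overlap:", intervals_overlap(ki_recombinant, ki_natural))
```

On this reading the Km ranges overlap while the Ki ranges do not, which is consistent with the description of Km as identical and Ki as only slightly different.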

Two possible glycosylation sites appear to be
conserved between the human and the bovine IAP. Three
other possible sites within other IAP sequences were not
found in the b.IAP. The high degree of heterologous
glycosylation of the purified enzyme was demonstrated by
isoelectric focusing (IEF). IEF was performed using the
Resolve-ALP system (Isolab, Akron, OH) as described in
Griffiths & Black, Clin. Chem. 33:2171-2177 (1987).
Samples of recombinant and purified enzyme were run either
treated with neuraminidase or untreated to compare the
amount of glycosylation.



A smeary band was obtained upon IEF of untreated
purified enzyme in contrast to a more distinct band for the
recombinant b.IAP protein. After treatment with
neuraminidase, both bands resolved into several sharp
bands, in which the purified enzyme showed considerably
more diversity than the recombinant enzyme.



EXAMPLE X

Heat Inactivation of Calf IAP

The heat stabilities of purified calf IAP and
recombinant calf IAP were determined at 56 °C. First, the
enzyme samples were diluted in 1 ml of DEA buffer
containing 1 M diethanolamine (DEA) (pH 9.8), 0.5
mM MgCl2 and 20 µM ZnCl2. The solution was heated at 56 °C
for the fixed time intervals indicated in Table I. Fifty
µl of the enzyme solution were removed and pipetted into a
microtiter well and stored on ice until the end of the
longest incubation period. At the end of the experiment,
the residual activity was measured by the addition of 200
µl of DEA buffer containing p-nitrophenylphosphate (10 mM).
For comparison, a sample of recombinant
enzyme was pretreated with 0.2 units/ml of neuraminidase
for 16 hours at room temperature, followed by the same heat
inactivation treatment. The results of the heat
inactivation studies are shown in Figure 4.
TABLE I

Heat Inactivation of Intestinal AP

                                      Time (minutes)
                                      Residual activity (%)

Calf IAP (intestinal extract)         100   87     65.6   48.7   36     23.4
Recombinant IAP                       100   80.6   59.5   39.6   28.5   18.5
Recombinant IAP upon neuraminidase    100   80.8   55.9   38.1   27.1   20.3
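The three inactivation series can be compared point by point. The following sketch is illustrative only: the dictionary keys and function name are hypothetical, the values are transcribed from Table I (the second value of the first row is read as 87), and no decay constants are derived because the incubation times are not reproduced in the table.

```python
# Residual activity (%) at successive time points, as read from Table I.
residual_activity = {
    "calf IAP (intestinal extract)": [100, 87, 65.6, 48.7, 36, 23.4],
    "recombinant IAP": [100, 80.6, 59.5, 39.6, 28.5, 18.5],
    "recombinant IAP + neuraminidase": [100, 80.8, 55.9, 38.1, 27.1, 20.3],
}


def max_abs_difference(series_a, series_b):
    """Largest point-by-point difference between two series (one decimal)."""
    if len(series_a) != len(series_b):
        raise ValueError("series must have the same number of time points")
    return round(max(abs(a - b) for a, b in zip(series_a, series_b)), 1)


natural = residual_activity["calf IAP (intestinal extract)"]
recombinant = residual_activity["recombinant IAP"]
treated = residual_activity["recombinant IAP + neuraminidase"]

print("natural vs recombinant:", max_abs_difference(natural, recombinant))
print("recombinant vs treated:", max_abs_difference(recombinant, treated))
```

The largest gap between the untreated and neuraminidase-treated recombinant enzyme is smaller than the gap between natural and recombinant enzyme, so on these figures the neuraminidase pretreatment changes the inactivation profile only marginally.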

The foregoing description of the invention is
exemplary for purposes of illustration and explanation. It
should be understood that various modifications can be made
without departing from the spirit and scope of the
invention. Accordingly, the following claims are intended
to be interpreted to embrace all such modifications.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: La Jolla Cancer Research Foundation
(B) STREET: 10901 North Torrey Pines Road
(C) CITY: La Jolla
(D) STATE: California
(E) COUNTRY: USA
(F) POSTAL CODE (ZIP): 92037
(G) TELEPHONE: (619) 455-6480
(H) TELEFAX: (619) 455-0181

(ii) TITLE OF INVENTION: RECOMBINANT CALF INTESTINAL ALKALINE
PHOSPHATASE

(iii) NUMBER OF SEQUENCES: 13

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version 1.25 (EPO)

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 07/849,219
(B) FILING DATE: 10-MAR-1992



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Lys Asn His Glu
1               5

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

GCTAGCCATG CAGGGGGCCT GCG  23

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

TCGGCCGCCIT GAAGGAGC  18
(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement (1)
(D) OTHER INFORMATION: /note= "N=C OR A"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement (2)
(D) OTHER INFORMATION: /note= "N=AG OR GT"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement (3)
(D) OTHER INFORMATION: /note= "N=A OR G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

NNNAGT  6

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement (1)
(D) OTHER INFORMATION: /note= "Y=T OR C"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement (3)
(D) OTHER INFORMATION: /note= "Y=C OR T"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement (4)
(D) OTHER INFORMATION: /note= "Y=AG OR G"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

YNYY  4
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CACACCTCCT GCCCAGNNNN NNNCTGGTGA GGAGCTGAG  39

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

AGCCACACCC  10

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ACAGAGAGGA GA  12

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5399 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: join(1501..1567, 1647..1763, 1878..1993, 2179
..2353, 2433..2605, 2864..2998, 3084..3156, 3257
..3391, 3475..3666, 3879..3995, 4101..4402)
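The exon coordinates of a join() location can be checked for internal consistency: the summed exon lengths should be a multiple of three and should correspond to the 533 residues of SEQ ID NO: 10 plus one stop codon. The snippet below is a sketch under stated assumptions. The function names are hypothetical, and the boundaries of the fourth and fifth exons (2179..2353 and 2433..2605) are inferred here from the printed sequence positions rather than read directly from the feature table.

```python
import re


def parse_join(location):
    """Extract (start, end) pairs from a join(a..b, c..d, ...) location."""
    return [(int(start), int(end))
            for start, end in re.findall(r"(\d+)\.\.(\d+)", location)]


def cds_length(segments):
    """Total number of bases covered by inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


LOCATION = ("join(1501..1567, 1647..1763, 1878..1993, 2179..2353, "
            "2433..2605, 2864..2998, 3084..3156, 3257..3391, "
            "3475..3666, 3879..3995, 4101..4402)")

exons = parse_join(LOCATION)
total = cds_length(exons)
print(len(exons), "exons,", total, "coding bases,",
      total // 3 - 1, "residues plus stop codon")
```

A total that is not divisible by three would point to a misread exon boundary.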

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

AAGCTTTCAC CTTCTCTGAA AACAGAGAGA CAGTCCTCAG CCCCAGTCCT CACCCTTCCT  60
ACCTCCCTGC CTGATGCCCA GGCAATCATC TGGTGGCGTG TCACCTCCCT CTGTCCCATG  120
AGTTCCACTA GATGTGGCQC TCAAGAAAAA GGGCTTCCCT GTTGGCTCAG CTGGTAAAGA  180
ATCCZCCAGC AATGXAGGAG ACCTGGGTTC GATCCCTGGG TTGGGAGGAT ACCCTGGAGA  240
AGGGAATGGC TACCCACTCC AGTATTCTTG CCTGGATAAT CCCATGGACA GAGGAGTCTG  300
GCAGGCTGCA GACCATAAGG TAGAAAGAGT CAGACATGAC TGAGCAACTA AGCACAATAT  360
TCCACTOGAT ATATCATACT TTGTTCATCC ATTXGTCTGC TCTCGATGGT TGAGTGGCTT  420
GTGCCOfCTTG GCTACTGTGA GTAATGCTAC TAAAAa?GTGA GTGTGCAAAT ACCTCTTATA  480
GATCTTGATT TCAATTATTG GGGATACACA CCCAGAAGGC GGATTGTTGG ATOTGAGAAT  540
GCCTTTTTGA ACCCCAACCT GGGGOTACTG AAACCCTAGC TCCTTATCAG AAGCTGTTCC  600
TGTGAGTGTG TGTGGCCTCT GGAGAGAAGA GACTCACCTC TGCCTTCCAT TTACCTCTCC  660
AATGGAGCAG AGGTTGCAAA CTTCAGTTAA TGGGCACOJGG GCCCACGCCT GTCGACCCGT  720
CACAGGCACC TTACACACAC ACACACACAC ACACACACAC ACAAACAGCA CTGCAGACCC  780
AGCTCTTCAG TAACTGAAGA CACAGACAAG GCCCCCGCTC TGCTGTCACC TCCAGTCCCA  840
TCCTTCTCCA CAGCAGAAGC TGGGCCCAGG CTCCCATGTG CCCCCACTAG CCCAGTGCCC  900
ACACCTCCTG CCCAGGTCAA GTCTGGTGAG GAGCTGAGCA GGGGGCAGGG CAGACAGGCC  960
TCCCCGTGGA TCTCTGTCTC AGGGCGCGAG GGAACZAACC CAGGCCCCTG GCCAGGCTCT  1020
GTCCCTAAGC ACTGGGAACC AAACCAGGCC AAGGCTGAGT CTCAGAAAAC ACTGAACACG  1080
TGAAGGAAGG AGAGATGGTT CTCCCACAGG ACTTGGTCAG CAGAGGGCTG GGAGGAGCCT  1140
CAGTCAGGAC CTTGAAAACG TTCCTCAGGC CTAGACATCT GCACCCTAAT CCCCACCCCA  1200
CCCTGAGGAG ACAGCTGGGA CCATCCTGGG AGGGAGGGAC CTGAATCCTC AGGACCCCTA  1260
CTGCTAAGCC ACACCCACCA CATGCCCCTG GCAACAGGGC TCAAAGTCAT AGGGCAGGTG  1320
AGGGGCAGGG TGTGGCCACC CGGGGAACCT GGGATGGACA AGGAGACTTT AATAGCAGGG  1380
ACAAAGTCTA TCTAGATTTA AGCCCAGCAG GCCAAGCTGC AGCCGGTCCC TGGTGTCCCA  1440
GCCTTCCCCT GAGACCCGGC CTCCCCAGGT CCCATCCTGA CCCTCTGCCA TCACACAGCC  1500

ATG CAG GGG GCC TGC GTG CTG CTG CTG CTG GGC CTG CAT CTA CAG CTC  1548
Met Gln Gly Ala Cys Val Leu Leu Leu Leu Gly Leu His Leu Gln Leu
1               5                   10                  15

TCC CTA GGC CTC GTC CCA G GTAATCAGGC GGCTCCCAGC AGCCCCTACT  1597
Ser Leu Gly Leu Val Pro
            20



30 



CACAGG66C6 GCTCTAGGCT GACCTGACCA ACACTCTCCC CTTGGGCAG TT GAG 

val Glu 



1651 



GAG GAA GAC CCC GCC TTC TGG AAC CGC CAG 

Glu Glu Asp Pro Ala Phe Trp Asn Arg Gin 
25 30 



GCA GCC CAG GCC CTC GAT 

Ala Ala Gin Ala Leu Asp 
35 40 



35 GTG GCT AAG AAG CTG CAG CCC ATC CAG ACA GCC GCC AAG AAT GTC ATC 

Val Ala Lys Lys Leu Gin Pro lie Gin Thr Ala Ala Lys Asn Val lie 

45 50 55 

CTC TTC TTG GGG GAT G GTGAGTACAT GAGGCCAGCC CACCCCCTGT 

Leu Phe Leu Gly Asp 
40 60 



1699 



1747 



1793 
CCCCTGACAG GCCTGGAACC CTCTGA-TGCC GGCTGACCCA GGTOIGGCCCC AGAAACTCGG 1853
ACCTGAGACA CTGTGTACCT TCAG GG ATG GGG GTG CCT ACG GTG ACA GCC 1903

Gly Met Gly val Pro Thr Val Thr Ala 

65 70 

^ ^ T?f ^ """^ ^ GGC AAA CTG GGA CCT GAG ACA 1951 

Thr Arg He ten Lys Gly Gin Met Asn Gly Lys Leu Gly Pro Glu Thr 

75 80 85 

CCC CTG GCC ATG GAC CAG TTC CCA TAC GTG GCT CTG TCC AAG 1003 

Pro Leu Ala Met Asp Gin Phe Pro Tyr Val Ala Leu Ser ^ 

90 95 

GTAAGGCCAA 6TGGCCTCAG GGTGGTCTAC ACCAQAGGG6 TGGGTGTGGG CCTAGG6AGC 2053 
AGGGTAGG&G G6AAACCCAG 6AGGGCTAGG 6GCTGAGATA GGGGCTGGGG GCTGTGAGGA 2113 
TGGGCCCAGG GCTGGGTCAG GAGCTGGGTG TCTACCCAGC AGAGCGTAAG GCATCTCTGT 2173 

15 ^ '^'^ ^''^ ^«C GCA GGC ACT 2220 

la Thr Tyr Asn Val Asp Arg Gin Val Pro Asp ser Ala Gly Thr 

105 no 

GCC ACT GCC TAC CTG TGT GGG GTC AAG GGC AAC TAC AGA ACC ATT GCT l-ita 

Ala Thr Ala Tyr Leu cys Gly Val Lys Gly Asn ^% tS iS 

120 125 

?at sor 11^ f?*" f^*" AAC CAG TGC AAA ACG ACA CGT GGG AAT 2316 

Val ser Ala Ala Ala Arg Tyr Asn Gin Cys Lys Thr Thr Arg Gly Asn 

145 

Su?SJSser5S^iiSf^f?*=*^^^«^ GGTGGGCTTGG 2363 
taiu vaj. Tnr ser Val Met Asn Arg Ala Lys Lys Ala 

150 155 

GC6TCAGCTT CCTGGGCaGG GACGGGCTCA GAGACCTCAG TGGCCCACCG TGACCTCTGC 2423 
CACCCTCAG ^GG ^ <5T6 GGA GTG GTG ACC ACC ACC AGG GTG CAG 2470 

Val 
170 



«i ~«» BTO GTG ACC ACC ACC AGG GTG CAG 

Gly Lys ser Val Gly val Val Thr Thr Thr Arg val Gin 
160 ' 



30 



CAT GCC TCC CCA GCC GGG GCC TAC GCG CAC ACG GTG AAC CGA AAC TGG 2518 

HIS Ala ser Pro Ala Gly Ala Tyr Ala His Thr Val Asn Arg Asn ^ 



Ivl sS 2n f«« f''' l'''^ "^"^ "^"^ <^ 6GC TGC CAG 2566 

35 ^^"^ ^^"^ Jig ^« Pi^o Ala Asp Ala Gin Met Asn Gly cys Gin 

200 

2o fll Sa Sf^ ^ *™ '^'^ *™ 6TGCGACATG 2615 

,05 Asp He Asp 

210 215 

TTGGGCACAG GGCGGGGCTG GGCACAGGTG GTGGGGCACA CTCGCAACAC AGTCGTAGGT 2675 

40 AACCTCCAGC CTGCGGT6TT TCAGGGTTTT CATGGGTTTG TGTGTGTGTG TATGTGTGGT 2735 

6GGGTGGCAC CATGTAGGAG GTGGGGACAG GCCTTTCCCA CAGACCTGGT GGGGGAGGTA 2795 

GGGGCT6TGT GAGAGGAGTA AAGGGCCAGC CAGGCCCCTA ACCCACCT6C CTAACTCTCT 2855 

GGCTCCAG GTG ATC CTG GGT GGA GGC CGA AAA TAC ATG TTT OCT GTG GGG 2905 

val He Leu Gly Gly Gly Arg Lys Tyr Met Phe Pro vS Gly 

220 225 230 



ACC CC& GhC OCT GAA TAG CCA GAT GAT GCC 

Thr pro Asp Pro Glu Tyr Pro Asp Asp Ala 

235 240 



GTG AAT GGA GTC 

val Asn Gly val 

245 



AAG CGA AAG GAG AAC CTG GTG CAG GCA TGG CAG GCC AAG CAC CAG 

liva Arg Lys Gin Asn Leu Val Gin Ala Trp Gin Ala Lys His Gin 
^ 250 255 260 

GTAATGGGGG CTCACGGATG TGGGGGTACA GTGGGGCTGG GCCTGGGGTG TCGGCTATGG 

CTGAGGCCTG GTTCTGCCCT CCCAG GGA GCC CAG TAT GTG TGG AAC CGC ACT 

Gly Ala Gin Tyr val Trp Asn Arg Thr 

265 270 

CTC CTT CAG GCG GCC GAT GAC TCC AGT GTA ACA CAC CTC ATG G 
Z.eu Leu Gin Ala Ala Asp Asp Ser ser Val Thr His Leu Met 

275 280 285 



15 TCCTCAGCTT GAGGGTCACC 



ACTGCTCCCC TTTCCCACAG GC CTC TTT GAG GCG 

Gly Leu Phe Glu Pro 

290 



2953 



2998 



3058 
3110 



3156 



GTAAG6AGTG CAGGCACCCT CACTGTCCTC CGCAGGAATG GGTGGCATGG GCCACGGCTG 3216 



3270 



20 



25 



30 



35 



40 



GCA GAC ATG AAG TAT AAT' GTT CAG CAA GAC CAC ACG AAG GAC GCG AGG 

Ala Asp Met Lys-Tyf Asn Val Gin Gin Asp His Thr Lys Asp Pro Thr 

-■"295 ' 300 305 

CTG CAG GAA ATG ACA GAG GTG GCC CTG CGA GTC GTA AGC AGG AAC GCC 

Leu Gin Glu Met Thr Glu Val Ala Leu Arg Val val Ser Ai^g Asn Pro 

310 315 320 



AGG GGC TTC TAG CTC TTT GTG GAG 

Arg Gly Phe Tyr Leu Phe Val Glu 
325 330 



G GTGAGTGGCA GCGCGTTGGT 



GAAGAGAGGT GTGATGAGGG CGATGAGGGT GGGTTTGGTA TGTTATATGT GAGTTATGTG 



GAG GA GGC CGC ATT GAC 

Gly Gly Arg lie Asp 

335 



GGT CAC CAT GAT GAC AAA GCT TAT ATG 

Gly His His Asp Asp Lys Ala Tyr Met 

340 345 



GTG AGG GAG GCG GGT ATG TTT GAC AAT GCG ATG 

Leu Thr Glu Ala Gly Met Phe Asp Asn Ala lie 

350 355 



AAG GGT AAT 

Lys Ala Asn 

360 



GAG CTC ACT AGC GAA GTG GAC ACG CTG ATG GTT GTG AGT GCA GAG CAG 

Glu Leu Thr Ser Glu Leu Asp Thr Leu lie Leu Val Thr Ala Asp His 

365 370 375 

TCT CAT GTC TTC TCT TTT GGT GGC TAT ACA CTG CGT GGG ACG TCC ATT 

Ser His Val Phe Ser Phe Gly Gly Tyr Thr Leu Arg Gly Thr Ser lie 
380 385 390 

TTT G GTAAGCGCAG GGAGAGTGGC AGGTCGTTGC CCCTAAGTTA CGAGGCACAA 
Phe 



3318 



3366 



3411 



3471 
3518 



3566 



3614 



3662 



3716 



GTGGTGTGAG GCAGTTCCTC TATCTGTCTA GTGGGGTAGT AGAGCACACT GCCTGGTACG 3776 

GTCTGGTGAG GATTGTCACT GAGAGAGAGA GTGGGGATGG CTGTGGAGAG AGGGGAGCAC 3836 

45 AAGGTAGGTG AGTGTGATCA C6GGGTCCCC TCTTCGGTGA AG GT GTG GCG GGG 3889 

Gly Leu Ala Pro 
395 
AGC AAG GCC TTA GAC AGC AAG TCC TAC ACC TCC ATC CTC TAT GGC AAT

Ser Lys Ala Leu Asp Ser Lys Ser Tyr Thr Ser Ile Leu Tyr Gly Asn
400 405 410 

GGC CCA GGC TAT GCG CTT GGC GGG GGC TCG AGG CCC GAT GTT AAT GAC 

Gly Pro Gly Tyr Ala Leu Gly Gly Gly ser Arg Pro Asp Val Asn Asp 
4" 420 425 430 

AGC ACA AGC G GTAAGTGTAG TAGGTGGGGC GCTGGGAGGT GGGGACCCTG 

Ser Thr ser 



3937 



3985 



4035 



10 GCCAGAAATT GTGGGGAGGG GAAGGCTGCC TCCCTTGTCA CATTAACTTC CCTTCTTCTG 
GCCAG AG GAC CCC TCG TAC CAG CAG CAG GCG GCC GTG CCC CAG GCT 

Glu Asp Pro Ser Tyr Gin Gin Gin Ala Ala Val Pro Gin Ala 
435 440 445 



15 



20 



25 



30 



AGC GAG ACC CAC GGG GGC GAG GAC GTG 

Ser Glu Thr His Gly Gly Glu Asp Val 
450 455 



GTG TTC GCG CGC GGC CCG 

Val Phe Ala Arg Gly pro 
460 



CAG GCG CAC CTG GTG CAC GGC GTC GAG GAG GAG ACC TTC GTG GCG CAC 

Gin Ala His Leu Val His Gly Val Glu Glu Glu Thr Phe Val Ala His 
465 470 475 

ATC AT6 GCC TTT GCG GGC TGC GTG GAG CCC TAC ACC GAC TGC AAT CTG 

lie Met Ala Phe Ala Gly Cys Val Glu Pro Tyr Thr Asp Cys Asn Leu 
480 485 490 495 

CCA GCC CCC ACC ACC GCC ACC AGC ATC CCC GAC GCC GCG CAC CTG GCG 

Pro Ala Pro Thr Thr Ala Thr Ser He Pro Asp Ala Ala His Leu Ala 

500 505 510 



GCC AGC CCG CCT CCA CTG GCG CTG CTG GCT GGG GCG ATG 
Ala Ser Pro Pro Pro Leu Ala Leu Leu Ala Gly Ala Met 

515 520 

CTG GCG CCC ACC TTG TAC TAACCCCCAC CAGTTCCAGG TCTCGGGATT 

Leu Ala Pro Thr Leu Tyr 
530 



CTG CTG 
Leu Leu 



4095 
4141 



4189 



4237 



4285 



4333 



4381 



4429 



TCCCGCTCTC CTGCCCAAAA CCTCCCAGCT CAGGCCCTAC CGGAGCTACC ACCTCAGAGT  4489
CCCCACCCCG AAGTGCTATC CTAGCTGCCA CTCCTGCAGA CCCGACCCGG CCCCACCACC  4549
AGAGTTTCAC CTCCCAGCAG TGATTCACAT TCCAGCATTG AAGGAGCCTC AGCTAACAGC  4609
CCTTCAAGGC CCAGCCTATA CCGGAGGCTG AGGCTCTGAT TTCCCTGTGA CACGCGTAGA  4669
CCTACTGCCC GACCCCAACT TCGGTGGCTT GGGATTTTGT GTTCTGCCAC CCTGAACCTC  4729
AGTAAGGGGG CTCGGACCAT CCAGACTGCC CCTACTGCCC ACAGCCCACC TGAGGACAAA  4789
GCTGGCACGG TCCCAGGGGT CCCAGGCCCG GCTGGAACCC ACACCTTGCC TTCAGCGACC  4849
TGGACTCTGG GTTCGGAGAG TGGCTTCGGG AGGCGTGGTT TCCGATGGGC GTGCTCTGGA  4909
ACGTGCTCGC CTGAACCAAC CTGTGTACAC TGGCCAGGAA TCACGGCCAC CAGAGCTCGG  4969
ACCTGACAGA GCCCTCAGCA GCCCCTCCTA GACCAACGTA CCCATTACAG AGAGGAGACA  5029
GTGACACAGA GGAGAGGAGA CTTGTCCCAG GTCCCTCAGC TGCTGTGAGG GCGGCCCTGG  5089
TGCCCCTTCC AGGCTGGGCA TCCCAGTAGC AGGAGGGGAC CCGGGGGTGG GGACACAGGC  5149
CCCGCCCTCC CTGGGAGGCA GGAAGCAGCT CTCAAATAAA CTGTTCTAAG TATGATACAG  5209
GAGTGATACA TGTGTGAAGA GAAGCCCTTA GGTGGGGGCA CAGAGTGTCT GGGTGAGGGG  5269
GGTCAGGGTC ACATCAGGAG GTTAGGGAGG GGTTGAXGAA GGGCTGACGT TGAGCAAAGA  5329
CCAAAGGCAA CTCAGAAGGA GAGTGGTGGA GGACTGGGTG TGGTCAGCAG GGGGACTGGT  5389
TGGGGGATCC  5399

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Gln Gly Ala Cys Val Leu Leu Leu Leu Gly Leu His Leu Gln Leu
1               5                   10                  15

Ser Leu Gly Leu Val Pro Val Glu Glu Glu Asp Pro Ala Phe Trp Asn
            20                  25                  30

Arg Gln Ala Ala Gln Ala Leu Asp Val Ala Lys Lys Leu Gln Pro Ile
        35                  40                  45

Gln Thr Ala Ala Lys Asn Val Ile Leu Phe Leu Gly Asp Gly Met Gly
    50                  55                  60

Val Pro Thr Val Thr Ala Thr Arg Ile Leu Lys Gly Gln Met Asn Gly
65                  70                  75                  80

Lys Leu Gly Pro Glu Thr Pro Leu Ala Met Asp Gln Phe Pro Tyr Val
                85                  90                  95

Ala Leu Ser Lys Thr Tyr Asn Val Asp Arg Gln Val Pro Asp Ser Ala
            100                 105                 110

Gly Thr Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Tyr Arg Thr
        115                 120                 125

Ile Gly Val Ser Ala Ala Ala Arg Tyr Asn Gln Cys Lys Thr Thr Arg
    130                 135                 140

Gly Asn Glu Val Thr Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys
145                 150                 155                 160

Ser Val Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ala
                165                 170                 175

Gly Ala Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp
            180                 185                 190

Leu Pro Ala Asp Ala Gln Met Asn Gly Cys Gln Asp Ile Ala Ala Gln
        195                 200                 205

Leu Val Asn Asn Met Asp Ile Asp Val Ile Leu Gly Gly Gly Arg Lys
    210                 215                 220

Tyr Met Phe Pro Val Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Ala
225                 230                 235                 240

Ser Val Asn Gly Val Arg Lys Arg Lys Gln Asn Leu Val Gln Ala Trp
                245                 250                 255

Gln Ala Lys His Gln Gly Ala Gln Tyr Val Trp Asn Arg Thr Ala Leu
            260                 265                 270

Leu Gln Ala Ala Asp Asp Ser Ser Val Thr His Leu Met Gly Leu Phe
        275                 280                 285

Glu Pro Ala Asp Met Lys Tyr Asn Val Gln Gln Asp His Thr Lys Asp
    290                 295                 300

Pro Thr Leu Gln Glu Met Thr Glu Val Ala Leu Arg Val Val Ser Arg
305                 310                 315                 320

Asn Pro Arg Gly Phe Tyr Leu Phe Val Glu Gly Gly Arg Ile Asp His
                325                 330                 335

Gly His His Asp Asp Lys Ala Tyr Met Ala Leu Thr Glu Ala Gly Met
            340                 345                 350

Phe Asp Asn Ala Ile Ala Lys Ala Asn Glu Leu Thr Ser Glu Leu Asp
        355                 360                 365

Thr Leu Ile Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly
    370                 375                 380

Gly Tyr Thr Leu Arg Gly Thr Ser Ile Phe Gly Leu Ala Pro Ser Lys
385                 390                 395                 400

Ala Leu Asp Ser Lys Ser Tyr Thr Ser Ile Leu Tyr Gly Asn Gly Pro
                405                 410                 415

Gly Tyr Ala Leu Gly Gly Gly Ser Arg Pro Asp Val Asn Asp Ser Thr
            420                 425                 430

Ser Glu Asp Pro Ser Tyr Gln Gln Gln Ala Ala Val Pro Gln Ala Ser
        435                 440                 445

Glu Thr His Gly Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln
    450                 455                 460

Ala His Leu Val His Gly Val Glu Glu Glu Thr Phe Val Ala His Ile
465                 470                 475                 480

Met Ala Phe Ala Gly Cys Val Glu Pro Tyr Thr Asp Cys Asn Leu Pro
                485                 490                 495

Ala Pro Thr Thr Ala Thr Ser Ile Pro Asp Ala Ala His Leu Ala Ala
            500                 505                 510

Ser Pro Pro Pro Leu Ala Leu Leu Ala Gly Ala Met Leu Leu Leu Leu
        515                 520                 525

Ala Pro Thr Leu Tyr
    530

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 540 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Gin Gly Asp Trp Val Leu Leu Leu Leu Leu Gly Leu Arg lie His 
35 1 5 10 15 



Leu Ser Phe Gly Val He Pro Val Glu Glu Glu Asn Pro Val Phe Trp 

20 25 30 



40 



Asn Gin Lys Ala Lys Glu Ala Leu Asp Val Ala Lys Lys Leu Gin Pro 
35 40 45 
Ile Gln Thr Ser Ala Lys Asn Leu Ile Leu Phe Leu Gly Asp Gly Met
50 55 60 



Gly val Pro Thr Val Thr Ala Thr Arg He Leu Lys Gly Gin Leu Gly 
^5 70 75 80 

Gly His Leu Gly Pro Glu Thr Pro Leu Ala Met Asp Bis Phe Pro Phe 

85 90 95 



Thr Ala Leu Ser Lys Thr Tyr Asn Val Asp Arg Gin Val Pro Asp ser 

100 105 110 



Ala Gly Thr Ala Thr Ala Tyr Leu cys Gly Val Lys Ala Asn Tyr Lys 
115 120 125 



""^ f i« ^^'^ ^9 Phe Asn Gin cys Asn Ser Thr 

130 135 



5n His Arg Ala Lys Lys Ala Gly 

150 155 160 

Lys ser val Gly Val Val Thr Thr Thr Arg Val Gin His Ala ser Pro 

165 170 175 



Ala Gly Thr Tyr Ala His Thr Val Asn Arg Asp Trp Tyr ser Asp Ala 

180 185 190 



Asp Met Pro ser Ser Ala Leu Gin Glu Gly cys Lys Asp He Ala Thr 

200 205 



Gin Leu He ser Asn Met Asp He Asp val He Leu Gly Gly Gly Arg 
210 215 220 



Lys Phe Met Phe Pro Lys Gly Thr Pro Asp Pro Glu Tyr Pro Gly Asp 
"5 230 235 240 

35 ser Asp Gin Ser Gly Val Arg Leu Asp ser Arg Asn Leu Val Glu Glu 

245 250 255 



Trp Leu Ala Lys Tyr Gin Gly Thr Arg Tyr Val Trp Asn Arg Glu Gin 

260 265 270 



40 
Leu Met Gln Ala Ser Gln Asp Pro Ala Val Thr Arg Leu Met Gly Leu
275 280 285 



Phe Glu Pro Thr Glu Met Lys Tyr Asp Val Asn Arg Asn Ala Ser Ala 
5 290 295 300 



Asp Pro ser Leu Ala Glu Met Thr Glu Val Ala Val Arg Leu Leu Ser 
305 310 315 320 

Arg Asn Pro Gin Gly Phe Tyr Leu Phe val Glu Gly Gly Arg lie Asp 
10 325 330 335 



20 



25 



Gin Gly His His Ala Gly Thr Ala Tyr Leu Ala Leu Thr Glu Ala val 

340 345 350 



15 Met Phe Asp Ser Ala lie Glu Lys Ala ser Gin Leu Thr Asn Glu Lys 

355 360 365 



Asp Thr Leu Thr Leu lie Thr Ala Asp His Ser Bis Val Phe Ala Phe 
370 375 380 



Gly Gly Tyr Thr Leu Arg Gly Thr Ser lie Phe Gly Leu Ala Pro Leu 
385 390 395 400 

Asn Ala Gin Asp Gly Lys ser Tyr Thr ser Xle Leu Tyr Gly Asn Gly 

405 410 415 



Pro Gly Tyr val Leu Asn Ser Gly Asn Arg Pro Asn Val Thr Asp Ala 

420 425 430 



Glu ser Gly Asp Val Asn Tyr Lys Gin Gin Ala Ala Val Pro Leu ser 
30 435 440 445 



Ser Glu Thr His Gly Gly Glu Asp Val Ala lie Phe Ala Arg Gly Pro 
450 455 460 



35 Gin Ala His Leu Val His Gly Val Gin Glu Gin Asn Tyr lie Ala His 

465 470 475 480 

Val Met Ala Phe Ala Gly cys Leu Glu Pro Tyr Thr Asp Cys Gly Leu 

485 490 495 



40 



Ala Pro Pro Ala Asp Glu Asn Arg pro Thr Thr Pro val Gin Asn ser 

500 505 510 
Ala Ile Thr Met nan Asn Val Leu Leu Ser Leu Gln Leu Leu Val Ser
515 520 



5 Met Leu Leu Leu Val Gly Thr Ala Leu Val Val ser 

530 535 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 559 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Gin Gly Pro Trp val Leu Leu Leu Leu Gly Leu Arg Leu Gin Leu 

5 10 IK 



15 



ser Leu ser Val He Pro Val Glu Glu Glu Asn Pro Ala Phe Trp Asn 

20 25 30 



Lys Lys Ala Ala Glu Ala Leu Asp Ala Ala Lys Lys Leu Gin Pro He 
•^^ 40 45 

Gin Thr ser Ala Lys Asn Leu He He Phe Leu Gly Asp Gly Met Gly 



25 



30 



35 



40 



60 



val Pro Thr val Thr Ala Thr Arg He Leu Lys Gly Gin Leu Glu Gly 

75 80 

His Leu Gly Pro Glu Thr Pre Leu Ala Met Asp Arg Phe Pro Tyr Met 

85 90 95 



Ala Leu ser Lys Thr Tyr ser Val Asp Arg Gin Val Pro Asp ser Ala 



ser Thr Ala Thr Ala Tyr Leu cys Gly val Lys Thr Asn Tyr Lys Thr 

12 5 



He Gly Leu Ser Ala Ala Ala Arg Phe Asp Gin Cys Asn Thr Thr Phe 

135 140 



Gly Asn Glu val Phe Ser val Met Tyr Arg Ala Lys Lys Ala Gly Lys 

155 
Ser Val Gly Val Val Thr Thr Thr Arg Val Gln His Ala Ser Pro Ser

165 170 175 



Gly Thr Tyr Val His Thr Val Asn Arg Asn Trp Tyr Gly Asp Ala Asp 

180 185 190 



Met Pro Ala Ser Ala lieu Arg Glu Gly Cys l<ys Asp Xle Ala Thr Gin 
195 200 205 



10 Leu lie Ser Asn Met Asp Xle Asn Val Xle lieu Gly Gly Gly Arg Lys 

210 215 220 



Tyr Met Phe Pro Ala Gly Thr Pro Asp Pro Glu Tyr Pro Asn Asp Ala 
225 230 235 240 

15 Asn Glu Thr Gly Thr Arg Leu Asp Gly Arg Asn Leu Val Gin Glu Trp 

245 250 



Leu Ser Lys His Gin Gly ser Gin Tyr Val Trp Asn Arg Glu Gin Leu 

260 265 270 



lie Gin Lys Ala Gin Asp Pro Ser Val Thr Tyr Leu Met Gly Leu Phe 
275 280 285 



Glu Pro Val Asp Thr Lys Phe Asp Xle Gin Arg Asp Pro Leu Met Asp 
25 290 295 300 



Pro Ser Leu Lys Asp M^t Thr Glu Thr Ala Val Lys Val Leu Ser Arg 
305 310 315 320 

Asn Pro Lys Gly Phe Tyr Leu Phe Val Glu Gly Gly Arg Xle Asp Arg 
30 325 330 335 



Gly His His Leu Gly Thr Ala Tyr Leu Ala Leu Thr Glu Ala Val Met 

340 345 350 



35 Phe Asp Leu Ala He Glu Arg Ala ser Gin Leu Thr Ser Glu Arg Asp 

355 360 365 



Thr Leu Thr Xle Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly 
370 375 380 
Gly Tyr Thr Leu Arg Gly Thr Ser Ile Phe Gly Leu Ala Pro Leu Asn
385 390 395 400 

Ala Leu Asp Gly Lys Pro Tyr Thr Ser lie Leu Tyr Gly Asn Gly Pro 

405 410 415 



Gly Tyr Val Gly Gly Thr Gly Glu Arg Pro Asn Val Thr Ala Ala Glu 

420 425 430 



Ser ser Gly ser ser Tyr Arg Arg Gin Ala Ala Val Pro Val Lys Ser 
10 435 440 445 



Glu Thr Bis Gly Gly Glu Asp Val Ala lie Phe Ala Arg Gly Pro Gin 
450 455 460 



15 Ala Bis Leu Val His Gly val Gin Glu Gin Asn Tyr He Ala His val 

465 470 475 480 

Met Ala Ser Ala Gly cys Leu Glu Pro Tyr Thr Asp Cys Gly Leu Ala 

485 490 495 

20 Pro Pro Ala Asp Glu ser Gin Thr Thr Thr Thr Thr Arg Gin Thr Thr 

500 505 510 



He Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Pro Val His 

520 525 



Asn Ser Ala Arg ser Leu Gly Pro Ala Thr Ala Pro Leu Ala Leu Ala 
530 535 540 



Leu Leu Ala Gly Met Leu Met Leu Leu Leu Gly Ala Pro Ala Glu 
30 545 550 

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 528 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Gin Gly Pro Trp Val Leu Leu Leu Leu Gly Leu Arg Leu Gin Leu 
^ 5 10 15 



40 



ser Leu Gly Val lie Pro Ala Glu Glu Glu Asn Pro Ala Phe Trp Asn 

20 25 30 
Arg Gln Ala Ala Glu Ala Leu Asp Ala Ala Lys Lys Leu Gln Pro Ile
35 40 45 



Gin Lys Val Ala Lys Asn Leu lie Leu Phe Leu Gly Asp Gly Leu Gly 
50 55 60 



Val Pro Thr Val Thr Ala Thr Arg He Leu Lys Gly Gin Lys Asn Gly 
65 70 75 80 

Lys Leu Gly Pro Glu Thr Pro Leu Ala Met Asp Arg Phe Pro Tyr Leu 

85 90 95 



Ala Leu ser Lys Thr Tyr Asn Val Asp Arg Gin val Pro Asp ser Ala 

100 105 110 



Ala Thr Ala Thr Ala Tyr Leu Cys Gly Val Lys Ala Asn Phe Gin Thr 
115 120 125 



lie Gly Leu Ser Ala Ala Ala Arg Phe Asn Gin Cys Asn Thr Thr Arg 
20 130 135 140 



Gly Asn Glu Val He Ser Val Met Asn Arg Ala Lys Gin Ala Gly Lys 
145 150 155 160 

ser Val Gly Val val Thr Thr Thr Arg Val Gin His Ala Ser Pro Ala 

165 170 175 



Gly Thr Tyr Ala His Thr Val Asn Arg Asn Trp Tyr Ser Asp Ala Asp 

180 185 190 



Met Pro Ala Ser Ala Arg Gin Glu Gly cys Gin Asp He Ala Thr Gin 
195 200 205 

Leu He ser Asn Met Asp He Asp Val He Leu Gly Gly Glv Ara Lvs 
210 215 220 



Tyr Met Phe Pro Met Gly Thr Pro Asp Pro Glu Tyr Pro Ala Asp Ala 
225 230 235 240 

Ser Gin Asn Gly He Arg Leu Asp Gly Lys Asn Leu Val Gin Glu Trp 

245 250 255

Leu Ala Lys His Gln Gly Ala Trp Tyr Val Trp Asn Arg Thr Glu Leu

260 265 270

Met Glu Ala Ser Leu Asp Gln Ser Val Thr His Leu Met Gly Leu Phe
275 280 285

Glu Pro Gly Asp Thr Lys Tyr Glu Ile His Arg Asp Pro Thr Leu Asp

295 300



Pro Ser Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu Ser Arg
305 310 315 320



Asn Pro Arg Gly Phe Tyr Leu Phe Val Glu Gly Gly Arg Ile Asp His

325 330 335

Gly His His Glu Gly Val Ala Tyr Gln Ala Leu Thr Glu Ala Val Met

345 350

Phe Asp Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu Glu Asp

360 365



Thr Leu Thr Leu Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly



[illegible] Gly Leu Ala Pro Ser Lys

Ala Gln Asp Ser Lys Ala Tyr Thr Ser Thr Leu Tyr Gly Asn Gly Pro

410

Gly Tyr Val Phe Asn Ser Gly Val Arg Pro Asp Val Asn Glu Ser Glu

420 425 430

Ser Gly Ser Pro Asp Tyr Gln Gln Gln Ala Ala Val Pro Leu

440 



Glu Thr His Gly Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln

^( 0 



Ala His Leu Val His Gly Val Gln Glu Gln Ser Phe Val Ala His Val

475 480

Met Ala Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala

490 495

Cys Thr Thr Asp Ala Ala His
500 505 



Val Ala Ala 
510 



Leu Ala Gly Thr Leu Leu Leu Leu Gly Ala Ser Ala Ala Pro 
515 520

1. An isolated nucleic acid comprising a
nucleotide sequence encoding a substantially purified calf
intestinal alkaline phosphatase or an active fragment
thereof.

2. The isolated nucleic acid of claim 1 having
a nucleotide sequence substantially the same as the
nucleotide sequence of Figure 1.

3. The isolated nucleic acid of claim 1, 
wherein said nucleic acid is cDNA. 

4. The isolated nucleic acid of claim 1, 
wherein said nucleic acid is RNA. 

5. The isolated nucleic acid of claim 1,
further comprising a second nucleotide sequence encoding a
polypeptide having specific reactivity with a ligand.

6. A vector comprising the nucleic acid of
claim 1.

7. The vector of claim 6, wherein said vector 
is a plasmid. 

8. A recombinant host cell comprising the
vector of claim 6.

9. A recombinant polypeptide produced by the
recombinant host cell of claim 8.

10. A method of obtaining recombinant calf
intestinal alkaline phosphatase or an active fragment
thereof, comprising culturing said recombinant host cell of
claim 8 and isolating said calf intestinal alkaline
phosphatase or active fragment thereof from said culture.

11. A cell culture comprising the recombinant 
host cell of claim 8 cultured in a suitable medium. 

12. A nucleic acid probe comprising a nucleotide
sequence complementary to a portion of a nucleotide
sequence specific to calf intestinal alkaline phosphatase.

13. A multifunctional polypeptide comprising an
amino acid sequence of calf intestinal alkaline phosphatase
or an active fragment thereof and a second amino acid
sequence of a reagent having specific reactivity with a
desired ligand.

14. The multifunctional polypeptide of claim 13,
wherein said reagent encoded by the second amino acid
sequence is an antibody.

15. A method for determining the presence of a
ligand in a sample, comprising:

(a) contacting said sample with a substantially
purified calf intestinal alkaline phosphatase and a reagent
that specifically binds to said ligand, said reagent
attached to said recombinant calf intestinal alkaline
phosphatase;

(b) contacting said sample with a detectable
substrate catalyzed by the recombinant polypeptide; and

(c) detecting the binding of said sample to the
reagent, wherein binding indicates the presence of said
ligand in the sample.

16. The method of claim 15, further comprising
the step of (d) determining an amount of binding of said
sample to the reagent, wherein the amount of binding
relates to the concentration of said ligand in the sample.

17. The method of claim 15, wherein said reagent 
is an anti-ligand antibody. 

18. The method of claim 15, wherein said reagent 
and recombinant calf IAP or active fragment thereof are
attached as a multifunctional polypeptide. 

19. The method of claim 15, wherein said reagent 
is an oligonucleotide. 

20. The method of claim 19, wherein said ligand 
is a cDNA or genomic DNA fragment.

AMENDED CLAIMS

[received by the International Bureau on 9 July 1993 (09.07.93);
original claims 1-20 replaced by amended claims 1-25 (4 pages)]

1. An isolated nucleic acid comprising:

(a) the nucleotide sequence shown in
Figure 1 encoding calf intestinal alkaline phosphatase;

(b) substantially the same nucleotide
sequence as the sequence shown in Figure 1, encoding calf
intestinal alkaline phosphatase; or

(c) a nucleotide sequence encoding an
active fragment of a calf intestinal alkaline phosphatase
encoded by a portion of a nucleotide sequence of (a) or
(b).

2. The isolated nucleic acid of claim 1, 
wherein the nucleotide sequence is the coding sequence 
shown in Figure 1. 

3. An isolated nucleic acid sequence,
comprising a nucleotide sequence encoding the amino acid
sequence of calf intestinal alkaline phosphatase of
Figure 1.

4. The nucleic acid of claim 1 wherein the
nucleic acid is cDNA.

5. An isolated RNA molecule encoding the
amino acid sequence of calf intestinal alkaline
phosphatase of Figure 1, or an active fragment of the
calf intestinal alkaline phosphatase of Figure 1.

6. The isolated nucleic acid of claim 1,
further comprising a second nucleotide sequence encoding
a polypeptide having specific reactivity with a ligand.

7. A vector comprising the isolated nucleic
acid of claim 1.

8. The vector of claim 7, wherein the vector
is a plasmid.

9. A host cell comprising the vector of claim
7.

10. A recombinant polypeptide produced by the
host cell of claim 9.

11. A method of obtaining recombinant calf
intestinal alkaline phosphatase or an active fragment
thereof, comprising culturing the host cell of claim 9
and isolating the calf intestinal alkaline phosphatase or
active fragment thereof from the culture.

12. A cell culture comprising the host cell of
claim 9 and a suitable medium.

13. A nucleic acid probe comprising a
nucleotide sequence complementary to a portion of the
nucleotide sequence of the coding region of the
shown in Figure 1.

14. A composition comprising recombinant calf
intestinal alkaline phosphatase or an active fragment
thereof attached to a reagent specifically reactive to a
ligand to be detected.

15. The composition of claim 14, wherein the
alkaline phosphatase or an active fragment thereof
attached to a reagent comprises a multifunctional
polypeptide.

16. The composition of claim 14, wherein the
alkaline phosphatase or an active fragment thereof is
chemically coupled to the reagent.

17. The composition of any of claims 14-16,
wherein the reagent comprises an antibody or a reactive

18. The composition of claim 17, wherein the
reagent has specific reactivity with a cancer marker,

or drug.

19. The composition of claim 17, wherein the 
reagent has specific reactivity with a nucleic acid. 

20. A method for determining the presence of a
ligand in a sample, comprising:

(a) contacting the sample with recombinant
calf intestinal alkaline phosphatase or an active
fragment thereof, wherein the recombinant calf intestinal
alkaline phosphatase or an active fragment is attached to
a reagent specifically reactive with said ligand;

(b) contacting the sample with a detectable
agent catalyzed by calf intestinal alkaline phosphatase;
and

(c) detecting the binding of the sample to the
reagent, wherein binding indicates the presence of said
ligand in the sample.

21. The method of claim 20, further comprising
the step of:

(d) relating the amount of binding to the
concentration of the ligand.

22. The method of claim 20, wherein the 
reagent is an anti-ligand antibody. 

23. The method of claim 20, wherein the
reagent and recombinant calf intestinal alkaline
phosphatase or active fragment thereof are attached as a
multifunctional polypeptide.

24. The method of claim 20, wherein the
reagent specifically reacts with an oligonucleotide.

25. The method of claim 24, wherein the
reagent specifically reacts with a cDNA or genomic DNA
fragment.

STATEMENT UNDER ARTICLE 19



Amended claims 1-3 and 5 find support on page 5.
Amended claim 13 finds support on page 8, lines 25-33.
Amended claims 14-18 find support on page 8, lines 8-9 and
20-23, page 9, lines 22-25, and page 10, lines 15-22. Amended
claims 20 and 21 find support on page 9, lines 11-34.

Other amendments, such as replacing "said" with "the",
are clerical in nature.

Figure 1A

A*GCTTTa«: CTTCTCTGAA AAO^GftaaSA CAOTCCTCAG CCC«GTCCT CACCCTICCT 60
ACCTCCCTGC CTGMJ6CCCA GGOVATCaTC TGGTGGCGTG TCACCTCCCT CTGTCCOITG 120
AGTTCCACTA GATOTGGCCC TCAa«UUUU. GGGCXTCCCT GTTGGCTCAG CTGOTAAAGA 180
ATCCTCCAGC AATGiaGGaS ACCTGGGTXC GATCCCTGGG TTGGGAGGAT 240
AGGG&AXGGC TACCC&CTCC AGTATTCTTG CCTGGMIAAT 300
GCaGGCTGCA GACCAT^ OaG^AAfi^T OUSACATGAC TGAfiCAACTA AGCACAAXAT 360
TCCACTGGJ^ AOaiTCATACT TTGTTCATCC ATTTGTCTGC T«STGGATGGT TGAGTGGCTT 420
GTGCCTCTTG GCTACTGT6A GTAATCCTAC TAAAMGTGA GTGTGCAAAT ACCTCTTATA 480
GMCTTGATT TI»ATTATTG GGSMiVCACA CCCAGAAGGC GGATTGTTGG ATGTGaGaAT 540
GCCTTTTTGA ACCCCAACCT GGGGTTACTG AAi«:cCTAGC TCCTOUlTauS AACCTGTTCC 600
TGTCaCTGTG TGTGGCCTCT GGAGAGAAGA GACTCACCTC TGCCTTCCM TTACCTCTCC 660
AATGGAGCAG AGGTTGC&AA CTTCAGXXAA TGGGCACTGG GCCCAC6CCT GTCOUZCCGT 720
"CaCGCACC TTACACACAC ACACACACAC ACACiUa«:AC ACAAACAGCA CTGCAGACCC 780
AfiCTCTTCAG TAACTGAAGA CACAGACAAG GCCCCCGCTC TGCTGXCACC TCCAGTCCCA 840
TCCTTCTCCA CAGCAGAiVfiC TGGGCCCAGG CTCCCATGTG CCCCCACTAG CCCAGTGCCC 900
ACACCXCCTG CCCAGGTCAA GTCTGGTGaG GftGCTGftGCA GGGGGCAGGG 960
TCCCCGTGGA TCTCT6TCTC AGGGCGCCAG GGaACTAACC CAGGCCCCTG 1020
&CT6GSJUU:C AiUkCCAGGCC AAGGCTGAGT CTCAGA&AAC ACTGAACACG 1080
A«5aGATG6XT CXCCCACAGG ACTTGGTGAG CAOaGGGCTG GGAGGAGCCT 1140
«6TCAfiG&C CTXGAAAACG TTCCTCAGGC CTAGACATCX G«UXCX^T CCC«CCCCA 1200
CCCTGAGGAG ACAGCTGGGA CCATCCXGGG AGGGftGGGAC CTGAATCCTC AGGACCCCXA 1260
CTGCXAAGCC ACACCCACCA CAXSCCCCXG GCAACAGCGC TCAAAfiXCAX AGGGCAGG«5 1320
A«GGGCAGGG XGXGGCCACC CGGGGAACCX GGGATGGACA AGGAGACXXX AAXAGCAGGG 1380
ACAJU^GXCXA XCXAGATTXA AGCCOUSCAG GCCAAGCTGC AGCCGGXCCC XGGXGXCCCA 1440
GCCXXGCCCX GAGACCCGGC CXCCCCftGGX CCCAXCCXGA CCCXCTGCCA XCACACAGCC 1500
Sa ?S ^ r° <='^ CAG CXC 1548

^ y iixa Cys Val Leu Leu Leu Leu Gly Leu His Leu Gln Leu

10 25

If ^ f !^ ^ ^ CCA <5 GXAAXCAGGC GGCXCCCAGC AGCCCCTACT 1597

Ser Leu Gly Leu Val Pro

20

ClUyMKSGGCG GCXCXAGGCX GACCXGACCA ACACXCTCCC CXTG^ XX GAG 1651

Val Glu
35 40

1699

Figure 1B

GTG GCT AAG AAG CTG CAG CCC ATC CA6 ACyi GCC GCC AAG AAT GTC MJC 1747

Val Ala Lys Lys Leu Gln Pro Ile Gln Thr Ala Ala Lys Asn Val Ile

45 50

CTC TTC TTG GGG GAT G GIGAGTACAT GA66CCAGCC CACCCCCTGT 1793
Leu Phe Leu Gly Asp

60

CCCCTGACAG GCCTGGAACC CT6TCATGCC GGCTGACCCA GGTTGGCCCC AGAAACTCGG 1853
ACCT6AGACA CTGTGTACCT TCAG GG ATG GGG GTG CCT ACG GTG ACA GCC 1903

Gly Met Gly Val Pro Thr Val Thr Ala

65 70

Th3 f ^ f ^ GGC AAA CTG GGA CCT GAG ACA 1951

Thr Arg Ile Leu Lys Gly Gln Met Asn Gly Lys Leu Gly Pro Glu Thr

75 80 85

CCC CTG GCC ATG GAC CAG TTC CCA TAC GTG GCT CTG TCC AAG 1993

Pro Leu Ala Met Asp Gln Phe Pro Tyr Val Ala Leu Ser Lys

90 95 100

GXaAGGCCAA GTGGCCTCAG GGTGGTCTAC ACCASftGGGG TGGGTGTGGG CCTAGGGAGC 2053

AGGGTAGGaG GGAAACCCAG GAGGGCTAGG GGCTGAGATA GGGGCTGGGG 6CTGTGAGGA 2113

TGG6CCCAGG GCTGGGTCAG GftSCTGGGTG TCTACCCAGC AGAGCGTAAG GCATCTCTGT 2173

CCCAG ACA TAC AAC GTG GAC AGA CAG GTG CCA GAC AGC GCA GGC ACT 2220

Thr Tyr Asn Val Asp Arg Gln Val Pro Asp Ser Ala Gly Thr

105 110

GCC ACT GCC TAC CTG TGT GGG GTC AAG GGC AAC TAC ASA ACC ATT GST 2268

Ala Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn ^ ?S 2S

120 125 130

?at r f?^ f ?f f V "^"^ AAC CAG TGC AAA ACG ACA CGT GGG AAT 2316

Val Ser Ala Ala Ala Arg Tyr Asn Gln Cys Lys Thr Thr Arg Gly Asn

135 140 145

^« «If ^^"^ f,''? ^''^ AAC CGG GCC AAG AAA GCA G GT6GGCTTGG 2363

Glu Val Thr Ser Val Met Asn Arg Ala Lys Lys Ala

GCGTCAGCTT CCT6GGCAGG 6ACGGQCTCA G»«acCTCAG TGGCCCACCG TGACCTCTGC 2423
CACCCTCAfi GG AAG TCC GTG GGA GTG GTG ACC ACC ACC AGG GTG CAG 2470

Gly Lys Ser Val Gly Val Val Thr Thr Thr Arg Val Gln
l*** 165 170

CAT GCC TCC CCA GCC GGG GCC TAC GCG CAC ACG GTG AAC CGA AAC TGG 2518

His Ala Ser Pro Ala Gly Ala Tyr Ala His Thr Val Asn

180 185

TAC TCA GAC GCC GAC CTG CCT GCT GAT GCA CAG ATG AAT GGC TGC CAG 2566

Tyr Ser Asp Ala Asp Leu Pro Ala Asp Ala Gln Met Asn Gly Cys Gln

195 200

Sn T^l f ?f ^f" ^^^^ ^ ^ GAC C5TGCGACATG 2615

205 210 215

TTCGGCACAG GGCGGGGCTG GGCACAGGTG GTGG6GCACA CTCGCAACAC AGTCGTAGGT 2675
AACCTCCAGC CTGCGGTGTT TCAGGGTTTT CATGGGTTTG TGTGTGTGTG TATGTGTGGT 2735
GGGGTGGCAC CATGTAGGAG GTGGGGACAG GCCTTTCCCA CAGACCTGGT GGGGGAGGTA 2795

Figure 1C

GaaafiGaGTA aagggccagc c&ggccccta acccacctgc ctaactctct

GGCTCCAG ;S S ^ ^ =^ ^ ^ CCT GTG GGG 2905

Val Ile Leu Gly Gly Gly Arg Lys Tyr Met Phe Pro Val Gly

225 230

S So S?S »^ f"^ '^'^ CGG 2953

Thr Pro Asp Pro Glu Tyr Pro Asp Asp Ala Ser Val Asn Gly Val Arg

235 240 245

AAG CGA AAG CAG AAC CTG GTG CAG GCA TG6 CAS cce aa/- ^y.r, 2998

Lys Arg Lys Gln Asn Leu Val Gln SS £J JJ? Ss

255 260

6TAAIGGGGG CTCACGGAT6 TGGGGGTACA GTGGGGCTGG GCCTGGGGTG TCGGCTATGG 3058

CTG&GGCCT6 GTTCTGCCCT CCCAG GGA GCC CAG TAT GTG TGG AAC CGC ACT 3110

Gly Ala Gln Tyr Val Trp Asn Arg Thr

265 270

Sa 2S J" S° f f = f ^ «TA ACA CAC CTC ATG G 3156

Leu Gln Ala Ala Asp Asp Ser Ser Val Thr His Leu Met
275 280 285

GIAACGACTC CACCCACCCT CACTGTCCTC CCCAGGAATG GGTGCCATGG 6CCACCCCTG 3216

TCCTCAGCTT GAGGGTCACC ACTGCTCCCC TTTCCCACAG GC CTC TTT GAG CCG 3270

Gly Leu Phe Glu Pro

290

S Sp 2S ^ ^ 5!? St« «^ ^« ^ CCG ACC 3318

Asp Her Lys Tyr Asn Val Gln Gln Asp His Thr Lys Asp Pro Thr

300

S SS IS ^ S; 5S 2= 25 ?S 3366

310 -a?c ^ ^9 Asn Pro

J^S ^ S 2S ?2 as = «=<=TT»OT 3411

325

GftACAGAGGT GTGAT6AGGG CCMJCAGGGT GGGTTTGGTA TCTTATATGT GACTTATCTG 3471

gS S5 ^ Sp ^= ^ 3518

jr y ixe Asp His Gly His His Asp Asp Lys Ala Tyr Met

335 340

3566

355

3614

3662

385 390
TOT G GTAAGCCCAG GGAGAGTGGC AG6TCGTTGC CCCTAAGTTA CGAGGCACAA 3716

CTCGTCTGAG CCAGTTCCTC TATCTGTCTA GTGGGGTAGT 3776

GATT6TCACT GACAGACAGA CTGGCCATGG CTCTGCACAC AGGGGAGCAC 3836

Figure 1D

AaGCXaGSTC. AfiTGTGaiCA CGGGGTCCCC TCTTCCCTGlk AG GT CTG GCC CCC 3889

Gly Leu Ala Pro
395 

AGC AAG GCC TTA GAC AGC AAG TCC TAC ACC TCC ATC CTC TAT GGC AAT 3937

Ser Lys Ala Leu Asp Ser Lys Ser Tyr Thr Ser Ile Leu Tyr Gly Asn

QOC CCA GGC TAT GCG CTT GGC GGG GGC TCG AGG CCC GAT GTT AAT GAC 3985

Gly Pro Gly Tyr Ala Leu Gly Gly Gly Ser Arg Pro Asp Val Asn Asp
415 420 425 430

S tS t2 ° ^^^^^'^^S TAGGXGGGGC GCTGGGAGGT GGGGACCCTQ 4035

GCCAGAAATT GTGGGGaxSGG GAAGGCT6CC TCCCTTGTCA CATIAACTTC CCTTCTTCTG 4095
CCCAG AG GAC CCC TCG TAC CA6 CA6 CAG GCG GCC GT6 CCC CAG GCT 4141

Asp Pro Ser Tyr Gln Gln Gln Ala Ala Val Pro Gln Ala
435 440 445

^ ^ ^ *^ *^ ®^ <5TG TTC GCG CGC GGC CCG 4189

Ser Glu Thr His Gly Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro

455 460

^ f?f 3^ ^ ^"^^ <^ GAG ACC TTC GTG GCG CAC 4237

Ala His Leu Val His Gly Val Glu Glu Glu Thr Phe Val Ala His

470 475

MC ATG GCC TTT GCG GGC TGC GTG GAG CCC TAC ACC GAC TGC AAT CTG 4285

Ile Met Ala Phe Ala Gly Cys Val Glu Pro Tyr Thr Asp Cys Asn Leu

485 490 495

ttA GCC CCC ACC ACC GCC ACC AGC ATC CCC GAC GCC GCG CAC CTG GCG 4333

Pro Ala Pro Thr Thr Ala Thr Ser Ile Pro Asp Ala Ala His Leu Ala

500 505 510

Hi Pro fl^ "° GCT GGG GCG ATG CTG CTG CTG 4381

Ala Ser Pro Pro Pro Leu Ala Leu Leu Ala Gly Ala Met Leu Leu Leu

515 520 525 

^ '^'^ TAACCCCCAC CAGTTCCAGG TCTCGGGATT 4429 

leu Ala Pro Thr Leu Tyr «<»<i» 
530 



XCCU6CTCTC CTGCCCAAAA CCTCCCAGCT CAGGCCCTAC CGGAGCTACC ACCTCAGAGT 4489
CCCCACCCCG AAGTGCTATC CTAGCTGCCA CTCCTGCAGA CCCGACCCGG CCCCACCACC 4549
ACSAiSTTTCAC CTCCCAGCAG TG21TTCACAT TCCAGCATTG AAGGAGCCTC AGCTAACAGC 4609
CCTTCAAGGC CCAGCCTATA CCGGAGGCTG AGGCTCZGAT TTCCCTGTGA CAGGC6TAGA 4669
CCTACTGCCC GACCCCAACT TCGG7GGCTT GGGATTTTGT GTTCTGCCAC CCTGA&CCTC 4729
A5TAAGQGGG CTCGGACCAT CCAGACTGCC CCTACTGCCC ACAGCCCACC TGAGGACAAA 4789
6CTGGCACGG TCCCAGGGGT CCCAGGCCCG GCTGGAACCC ACACCTTGCC TTCAGCGACC 4849
SGGACTCTGG GTTCGGAGAG TGGCTTCGGG AGGCGTGGTT TCCGATGGGC GTGCTCTGGA 4909
ACGTGCTCGC CTGAACCAAC CTGTGTACAC TGGCCAGGAA TCACGGCCAC CAGAGCTCGG 4969
ADCTG&CAGA GCCCTCAGCA GCCCCTCCTA GACCAACGTA CCCATTACAG AGAGGAGACA 5029
GKAGACAiGA GGAGAGGAGA CTTGTCCCAG GTCCCTCAGC TGCT6TGAGG GCGGCCCTGG 5089
TQCCCCTXCC AGGCTGGGCA TCCCAGTAGC AGCMGGGAC CCGGGGGTGG 6G1.CACAGGC 5149
CCCGCCCTCC CTGGGAfiGCA GGAAGCAGCT CTCAAM^AA CTGTTCTAAG TATGATACAG 5209 

TGTCTSaaCA GAAGCCCTTA G6TGGGGGCA CAGAGTGTCT GGGTGAGGGG 5269 
08TCACGGTC ACATOUSGAG GTTftGGGaGG GGTTGATGAA GGGCTGAC6T TGAGCAAAGA 5329 

CTOUaUkGGA CA6TGGTGCA GGACTGGGTG TGGTCAGC&6 GGGGACTGGT 5389 

5399 
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